optimal loss
We thank the reviewers for their insightful feedback, and we appreciate the opportunity to improve our paper. We would like to emphasize that Theorem 1 is the most important contribution of our paper due to its generality. In the Gaussian case, our sample complexity result follows directly from the expression for the optimal loss. Finally, while Dohmatob's bounds become non-trivial only when the adversarial We will also add a clearer description of the "translate and pair in place" coupling. Comparisons with Sinha et al. are in Section 7 and we compare to Dohmatob above.
Diagnosing and Improving Diffusion Models by Estimating the Optimal Loss Value
Xu, Yixian, Luo, Shengjie, Wang, Liwei, He, Di, Liu, Chang
Diffusion models have achieved remarkable success in generative modeling. Despite their more stable training, the diffusion loss is not indicative of absolute data-fitting quality, since its optimal value is typically not zero but unknown; a large loss may therefore reflect either a large optimal value or insufficient model capacity, and the two cannot be distinguished. In this work, we advocate estimating the optimal loss value for diagnosing and improving diffusion models. We first derive the optimal loss in closed form under a unified formulation of diffusion models, and develop effective estimators for it, including a stochastic variant that scales to large datasets with proper control of variance and bias. With this tool, we unlock the inherent metric for diagnosing the training quality of mainstream diffusion model variants, and develop a more performant training schedule based on the optimal loss. Moreover, using models with 120M to 1.5B parameters, we find that the power law emerges more clearly after subtracting the optimal loss from the actual training loss, suggesting a more principled setting for investigating scaling laws for diffusion models.
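For intuition on why the optimal loss has a closed form over a finite dataset, here is a minimal sketch (not the paper's actual estimator; the function name and the single-noise-level setup are assumptions). Treating the dataset as the true distribution, the optimal noise predictor at a forward-process level `x_t = alpha * x0 + sigma * eps` is derived from the posterior mean `E[x0 | x_t]`, which is a softmax-weighted average over the data points:

```python
import numpy as np

def estimate_optimal_eps_loss(data, alpha, sigma, n_samples=1000, rng=None):
    """Monte Carlo estimate of the optimal eps-prediction loss at one noise
    level, treating a finite dataset as the true data distribution.

    For an empirical distribution, the posterior over clean points given a
    noisy input is p(x0_i | x_t) proportional to N(x_t; alpha * x0_i, sigma^2 I),
    so the optimal denoiser E[x0 | x_t] is a softmax-weighted data average."""
    rng = np.random.default_rng(rng)
    n, d = data.shape
    total = 0.0
    for _ in range(n_samples):
        x0 = data[rng.integers(n)]
        eps = rng.standard_normal(d)
        xt = alpha * x0 + sigma * eps
        # log-posterior weights over all data points (uniform prior)
        logw = -np.sum((xt - alpha * data) ** 2, axis=1) / (2 * sigma ** 2)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        x0_hat = w @ data                        # posterior mean E[x0 | x_t]
        eps_hat = (xt - alpha * x0_hat) / sigma  # optimal noise prediction
        total += np.sum((eps - eps_hat) ** 2)
    return total / n_samples
```

A sanity check on the idea: with a single data point the posterior mean recovers `x0` exactly, so the optimal eps-prediction loss is zero; with several points and large `sigma`, the posterior spreads out and the optimal loss becomes strictly positive, which is exactly the nonzero floor the abstract says should be subtracted from the training loss.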
Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker
Dai, Sihui, Ding, Wenxin, Bhagoji, Arjun Nitin, Cullina, Daniel, Zhao, Ben Y., Zheng, Haitao, Mittal, Prateek
Finding classifiers robust to adversarial examples is critical for their safe deployment. Determining the robustness of the best possible classifier under a given threat model for a given data distribution and comparing it to that achieved by state-of-the-art training methods is thus an important diagnostic tool. In this paper, we find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset. We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints. We further define other variants of the attacker-classifier game that determine the range of the optimal loss more efficiently than the full-fledged hypergraph construction. Our evaluation shows, for the first time, an analysis of the gap to optimal robustness for classifiers in the multi-class setting on benchmark datasets.
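In the binary-label special case the conflict hypergraph reduces to an ordinary conflict graph, and a matching in that graph already certifies a lower bound on the optimal 0-1 loss. The sketch below is a hypothetical illustration of that reduction (function names and the L2 threat model are assumptions, and the brute-force matching is only for tiny examples): two differently-labeled points whose eps-balls overlap can be driven to a common input, so every classifier must err on at least one point of each matched pair.

```python
import itertools
import math

def conflict_edges(points, labels, eps):
    """Pairs of differently-labeled points whose L2 eps-balls overlap:
    an attacker can move both to the same input, so no classifier can
    be correct on both simultaneously."""
    edges = []
    for i, j in itertools.combinations(range(len(points)), 2):
        if labels[i] != labels[j] and math.dist(points[i], points[j]) <= 2 * eps:
            edges.append((i, j))
    return edges

def max_matching_size(edges):
    """Exact maximum matching by exhaustive search (fine for tiny graphs)."""
    best = 0
    def rec(rest, used, count):
        nonlocal best
        best = max(best, count)
        for k, (i, j) in enumerate(rest):
            if i not in used and j not in used:
                rec(rest[k + 1:], used | {i, j}, count + 1)
    rec(edges, frozenset(), 0)
    return best

def optimal_loss_lower_bound(points, labels, eps):
    """Each matched conflicting pair forces at least one error, so the
    optimal 0-1 loss is at least (matching size) / (number of points)."""
    return max_matching_size(conflict_edges(points, labels, eps)) / len(points)
```

For multi-class data and larger perturbation sets, overlaps among three or more points must be tracked as hyperedges, which is where the paper's full hypergraph construction and its cheaper variants come in.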
Conflict-free joint sampling for preference satisfaction through quantum interference
Shinkawa, Hiroaki, Chauvet, Nicolas, Röhm, André, Mihana, Takatomo, Horisaki, Ryoichi, Bachelier, Guillaume, Naruse, Makoto
Collective decision-making is vital for recent information and communications technologies. In our previous research, we mathematically derived conflict-free joint decision-making that optimally satisfies players' probabilistic preference profiles. However, two problems exist regarding the optimal joint decision-making method. First, as the number of choices increases, the computational cost of calculating the optimal joint selection probability matrix explodes. Second, to derive the optimal joint selection probability matrix, all players must disclose their probabilistic preferences. Notably, explicit calculation of the joint probability distribution is not necessarily needed; what is necessary for collective decisions is sampling. This study examines several sampling methods that converge to heuristic joint selection probability matrices that satisfy players' preferences. We show that they can significantly mitigate the above problems of computational cost and confidentiality. We analyze the probability distribution each of the sampling methods converges to, as well as the computational cost required and the confidentiality secured. In particular, we introduce two conflict-free joint sampling methods based on quantum interference of photons. The first system allows the players to hide their choices while satisfying the players' preferences almost perfectly when they have the same preferences. The second system, where the physical nature of light replaces the expensive computational cost, also conceals their choices under the assumption that they have a trusted third party. This paper has been published in Phys. Rev. Applied 18, 064018 (2022) (DOI: 10.1103/PhysRevApplied.18.064018).
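To make "sampling instead of computing the joint matrix" concrete, here is a minimal classical sketch of conflict-free joint sampling for two players (an assumption for illustration only; the paper's photonic schemes achieve conflict avoidance physically and without disclosing preferences, which this centralized sketch does not):

```python
import random

def conflict_free_sample(pref_a, pref_b, rng=None):
    """Sample a joint choice (a, b) with a != b: draw player A's choice
    from its preference distribution, then draw player B's choice from
    its preference renormalized over the remaining options.

    Assumes each preference list has positive mass on at least two
    choices, so player B always has a valid option left."""
    rng = rng or random.Random()
    n = len(pref_a)
    a = rng.choices(range(n), weights=pref_a)[0]
    # zero out A's choice, so B's draw can never collide with it
    w_b = [p if k != a else 0.0 for k, p in enumerate(pref_b)]
    b = rng.choices(range(n), weights=w_b)[0]
    return a, b
```

This sequential heuristic avoids conflicts by construction and costs O(n) per sample rather than O(n^2) to build a joint matrix, but it only approximates preference satisfaction and requires one party to see both preference vectors, which is precisely the confidentiality gap the quantum-interference systems address.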
On the Sensitivity of the Lasso to the Number of Predictor Variables
Flynn, Cheryl J., Hurvich, Clifford M., Simonoff, Jeffrey S.
The Lasso is a computationally efficient regression regularization procedure that can produce sparse estimators when the number of predictors (p) is large. Oracle inequalities provide probability loss bounds for the Lasso estimator at a deterministic choice of the regularization parameter. These bounds tend to zero if p is appropriately controlled, and are thus commonly cited as theoretical justification for the Lasso and its ability to handle high-dimensional settings. Unfortunately, in practice the regularization parameter is not selected to be a deterministic quantity, but is instead chosen using a random, data-dependent procedure. To address this shortcoming of previous theoretical work, we study the loss of the Lasso estimator when tuned optimally for prediction. Assuming orthonormal predictors and a sparse true model, we prove that, with positive probability, the best possible predictive performance of the Lasso deteriorates as p increases, and that this probability can be made arbitrarily close to one given a sufficiently high signal-to-noise ratio and sufficiently large p. We further demonstrate empirically that the amount of deterioration in performance can be far worse than the oracle inequalities suggest and provide a real data example where deterioration is observed.
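The orthonormal-design assumption makes this phenomenon easy to simulate, since there the Lasso reduces to coordinatewise soft-thresholding of the noisy coefficients. The sketch below is a hypothetical illustration (function names and the simulation setup are assumptions, not the paper's experiments): even with the oracle-optimal lambda on a grid, the achieved loss of a fixed sparse signal grows as irrelevant predictors are added.

```python
import numpy as np

def soft_threshold(z, lam):
    """Coordinatewise soft-thresholding: the Lasso solution under an
    orthonormal design."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def best_tuned_lasso_loss(beta, sigma, lams, rng):
    """Orthonormal design: observe z = beta + N(0, sigma^2) per coordinate,
    then return the loss ||beta_hat - beta||^2 at the oracle-best lambda
    on the supplied grid (an idealized, data-dependent tuning)."""
    z = beta + sigma * rng.standard_normal(beta.size)
    return min(np.sum((soft_threshold(z, lam) - beta) ** 2) for lam in lams)
```

Averaging this oracle-tuned loss over repetitions with, say, five strong coefficients and the rest zero shows it increasing with p: larger p forces a larger threshold to suppress the extra noise coordinates, which in turn biases the strong coefficients more, matching the deterioration the paper analyzes.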
The Rate of Convergence of AdaBoost
Mukherjee, Indraneel, Rudin, Cynthia, Schapire, Robert E.
The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration $t$, the exponential loss of AdaBoost's computed parameter vector will be at most $\epsilon$ more than that of any parameter vector of $\ell_1$-norm bounded by $B$ in a number of rounds that is at most a polynomial in $B$ and $1/\epsilon$. We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within $C/\epsilon$ iterations, AdaBoost achieves a value of the exponential loss that is at most $\epsilon$ more than the best possible value, where $C$ depends on the dataset. We show that this dependence of the rate on $\epsilon$ is optimal up to constant factors, i.e., at least $\Omega(1/\epsilon)$ rounds are necessary to achieve within $\epsilon$ of the optimal exponential loss.
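The quantity being analyzed, the exponential loss (1/n) * sum_i exp(-y_i F(x_i)) of AdaBoost's combined hypothesis F, is easy to track on toy data. Below is a minimal sketch (an illustration only, not the paper's code; threshold stumps on 1-D inputs and the error clipping are assumptions) showing the loss shrinking round by round:

```python
import math

def adaboost_exp_losses(xs, ys, n_rounds):
    """Minimal AdaBoost with threshold stumps h(x) = s * sign(x - t) on 1-D
    data with labels in {-1, +1}; returns the exponential loss
    (1/n) * sum_i exp(-y_i * F(x_i)) after each round."""
    n = len(xs)
    w = [1.0 / n] * n      # example weights
    F = [0.0] * n          # combined score F(x_i) on the training points
    losses = []
    thresholds = sorted(set(xs))
    for _ in range(n_rounds):
        # pick the stump with the smallest weighted training error
        best_err, best_preds = None, None
        for t in thresholds:
            for s in (1, -1):
                preds = [s if x > t else -s for x in xs]
                err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
                if best_err is None or err < best_err:
                    best_err, best_preds = err, preds
        err = min(max(best_err, 1e-12), 1 - 1e-12)  # clip to keep alpha finite
        alpha = 0.5 * math.log((1 - err) / err)
        F = [f + alpha * p for f, p in zip(F, best_preds)]
        # exponential reweighting and renormalization of the examples
        w = [wi * math.exp(-alpha * p * y) for wi, p, y in zip(w, best_preds, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
        losses.append(sum(math.exp(-y * f) for y, f in zip(ys, F)) / n)
    return losses
```

Each round multiplies the exponential loss by the normalizer 2 * sqrt(err * (1 - err)) <= 1, so the recorded sequence is non-increasing; the paper's contribution is quantifying how many rounds this takes to get within epsilon of the best attainable value without any weak-learning assumption.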